LT James E. Heyman and Jeffery J. Leader
INTRODUCTION 

There are many applications that require the generation of 
pseudorandom bitstreams, that is, sequences of zeros and ones that 
meet certain criteria. A partial list of these applications 
includes coding, communications, spread spectrum techniques (both
for frequency allocation and security), simulations, and security
access.

In addition to the typical measures of pseudorandomness, 
applications dealing with communications security impose the 
additional requirement that any attempt to retrieve the method of 
generation should be extremely difficult. For these cases it is
useful to move beyond the scorecard of specific tests to a "larger"
definition of a suitable sequence as one in which no amount of
knowledge of previous bits gives an indication as to what the next
bit will be.

At least philosophically, it should be clear that any 
repeatable scheme that we as humans can come up with has to be 
deterministic (i.e., nonrandom) and therefore, at some level,
predictable. The period may be very large, but that is not the
point. The point is that prediction is logically possible.

An immediate reaction to the fact that pseudorandom sequences 
are being generated for security related applications is that it 
seems like a good idea to develop methods to attack them. 



Eventually the goal is to be able to accurately predict future bits
based on previous ones (assuming that through known plaintext or
some other method we are able to retrieve pure keystream). Our
preliminary goal, however, is a bit more modest: to determine the
nature of the generator that was used to generate a particular
bitstream. Although modest, this aim is far from trivial, for while
there is much documentation concerning how to attack a sequence if
the type of generator is known, there is no obvious way of deciding
which generator was used except by trial and error. As such, due
to the prevalence of shift registers, the standard strategy is to
attempt to simulate the sequence using Berlekamp-Massey (assuming
a reasonable linear complexity) and, if that doesn't work, to
try something else. As more generation schemes appear this method
becomes rather tedious and inefficient. For our purposes we
restricted our investigation to three major generation methods:
shift registers, a quadratic planar map, and the linear
congruential generator.



METHODOLOGY 

Our proposed method of attack involves the use of a neural
net. Specifically, we used the BrainMaker neural net software
package to set up a back propagation model with fixed training and
testing parameters, which was used to test a series of n bits to
predict one bit, with n running from one to 25. It must be noted
that throughout the experiments all of the parameters, such as
learning rate, momentum, test rate, etc., were fixed. This was done
to demonstrate the viability of the scheme in general (as opposed to
finding specific solutions to specific coding problems). For all
schemes bitstreams of length 1000 were used.
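
The "n bits to predict one bit" setup can be sketched as a sliding
window over the bitstream. The following is only an illustration in
Python, not the BrainMaker configuration itself: the function names
and the 10% test fraction are assumptions, since the actual test
rate is not stated here.

```python
def make_windows(bits, n):
    """Pair every run of n consecutive bits with the bit that follows it."""
    inputs = [bits[i:i + n] for i in range(len(bits) - n)]
    targets = [bits[i + n] for i in range(len(bits) - n)]
    return inputs, targets


def split(inputs, targets, test_fraction=0.1):
    """Hold out the tail of the data for testing (fraction is an assumption)."""
    cut = int(len(inputs) * (1 - test_fraction))
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])
```

With a 1000-bit stream and n = 25 this yields 975 input/target
pairs; repeating the construction for n = 1, ..., 25 and recording
the training and testing accuracy at each n gives the two curves
that make up a fingerprint.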

The basic idea is that for each family of generation schemes 
this process results in a characteristic training and testing curve 
that serves as a "fingerprint" of the process. The implication is 
that given a bitstream from an unknown source this attack will 
result in the determination of the family of generation schemes 
from which it came. 



SHIFT REGISTERS 

The most common scheme for generating pseudorandom bitstreams
is the linear feedback shift register (LFSR), the details of which
can be found in Golomb (1982). For our immediate purposes all that
needs to be known is that an appropriately wired n-stage shift
register will generate a binary sequence of length 2^n - 1. This
has the favorable consequence that a fairly small shift register
with, for example, only 64 stages can produce a sequence that is
approximately 1.8 x 10^19 bits long. Intuitively it would seem that
such a sequence could baffle even the most powerful supercomputers,
but it turns out that, by utilizing the Berlekamp-Massey or Zeigler
algorithm, an LFSR can be fully simulated if only 2n sequential bits
are known. (This is not a difficult program and can be completed
in 30 lines of MATLAB code.) Admittedly, getting 2n bits of pure
keystream might pose a problem, but it surely is much less of a
problem than dealing with the entire sequence.
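
To make the 2n-bit claim concrete, here is a minimal Python sketch
of the textbook Berlekamp-Massey algorithm over GF(2), together
with a small LFSR to exercise it. It is not the MATLAB program
mentioned above; the function names and the convention of
describing the register by its feedback lags are assumptions made
for illustration.

```python
def lfsr_bits(lags, state, count):
    """Each new bit is the XOR of the bits `lag` positions back.

    `state` must hold at least max(lags) starting bits.
    """
    s = list(state)
    while len(s) < count:
        bit = 0
        for lag in lags:
            bit ^= s[-lag]
        s.append(bit)
    return s[:count]


def berlekamp_massey(s):
    """Return (L, c): the linear complexity of s and the coefficients
    c[0..L] (c[0] = 1) such that s[i] = XOR of c[j] & s[i-j], j = 1..L."""
    n = len(s)
    c = [0] * (n + 1)
    b = [0] * (n + 1)
    c[0] = b[0] = 1
    L, m = 0, -1
    for i in range(n):
        # Discrepancy between s[i] and the current register's prediction.
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L, c[:L + 1]
```

For example, the 4-stage register with lags (3, 4), which
corresponds to x^4 + x + 1, started from 1000 has period 15, and
`berlekamp_massey` applied to its first 8 bits returns a register
of length 4 with the same lags.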

Inasmuch as an effective algorithm exists for attacking
shift register generated bitstreams, it is reasonable to expect that
any proposed new attack at least be able to match the efficacy of
the known algorithms.

The first example is an LFSR defined by the function f(x) =
x^13 + x^12 + x^11 + x + 1. This function is primitive over GF(2)
and as such the resultant sequence will have a full period of 8191.
The following graph shows the result of applying the neural net in
the manner described above.
As can be seen, the profile is similar to that of the linear
complexity in the sense that the pattern is quickly determined (at
n=13) and at that point any extra input bits are ignored. Of
particular interest is how both the training (solid) and testing
(dashed) lines remain relatively flat until the length of the shift
register is reached, at which point both spike up to perfect
training and testing. It is also noteworthy that, once past that
point, it usually took around 50 data passes to complete the
training. Another facet is that it is possible to go back into the
net and isolate which input bits have a direct effect on the
output. The procedure is as follows: Label the bits as they
correspond to the stages of an n-long shift register. Then, one by
one, change the bits from zero to one (or one to zero as the case
may be) and observe whether a given bit flips the output. If it
does, then the corresponding shift register stage of that bit has a
direct effect on the output. Checking all of the input bits in this
way makes it possible to recreate the polynomial associated
with the actual LFSR that was used.
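
The bit-flipping procedure can be sketched in Python as follows.
The trained net is treated as a black box `predict` that maps an
n-bit window to one output bit. Since no trained net is available
here, `stand_in_predictor` is a hypothetical substitute that simply
implements the recurrence for x^18 + x^7 + 1 (next bit = XOR of the
bits 18 and 11 positions back); both names are assumptions.

```python
def influential_inputs(predict, window):
    """Flip each input bit in turn; report positions that flip the output."""
    base = predict(window)
    hits = []
    for k in range(len(window)):
        probe = list(window)
        probe[k] ^= 1  # change zero to one, or one to zero
        if predict(probe) != base:
            hits.append(k)
    return hits


def stand_in_predictor(window):
    """Hypothetical stand-in for a trained net with 18 inputs."""
    return window[-18] ^ window[-11]
```

Numbering the window from its oldest bit, the stand-in reports
positions 0 and 7, which are the terms 1 and x^7 of the polynomial;
the predicted bit itself supplies x^18. For a real net, which is
only approximately linear, it would be prudent to repeat the probe
over several different windows.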

We can thus conclude that, even though we used more bits than
the Berlekamp-Massey algorithm would require, we were able to
generate the same results. More work needs to be done on the
effectiveness of this method as the total number of bits analyzed
is reduced in relation to the total length of the bitstream.

This experiment was then repeated on an LFSR based on the
function f(x) = x^18 + x^7 + 1 with the resulting curves displayed
below.

[Figure: training and testing curves for f(x) = x^18 + x^7 + 1]

Note that the exact same fingerprint shows up. In conclusion,
we believe that this combination of a training curve that stays low
and flat with a testing curve that hovers around .5, until
both jump up to 100%, is indicative of the use of a linear feedback
shift register. In addition, by examining the net at the point of
the jump the actual function can be retrieved and thus the shift
register itself can be recovered.

QUADRATIC MAP 

In previous work, Heyman (1993), the possibility of using a
non-linear chaotic system to generate pseudorandom bitstreams was
investigated. The process basically consists of following an orbit
on the classic Henon (1976) attractor and assigning a zero or a one
based on whether the point was on the left or right side of a
previously defined median point. Although there were some
anomalies concerning the "runs" property, the overall conclusion was
that it did generate a reasonably pseudorandom sequence based on
the larger definition of predictability. Although bitstreams of
the same length as earlier (1000) were used, these represent a
minuscule percentage of the total bitstream length since this
generation scheme has an approximate period of at least 10^15 when
generated on a personal computer. The following two graphs
summarize the results of the neural net attack on bitstreams
generated using this method.
[Figure: training and testing curves, henon(1000,0,0)]

[Figure: training and testing curves, henon(1000,.3,.2)]

Notice that both show a constant testing level of about 75%
while the training curve merges with the testing curve at n=15 and
then breaks out above at n=25. This same behavior was observed on
other Henon bitstreams.

Inasmuch as the Henon attractor is topologically conjugate to
a wide class of quadratic mappings of the real plane (all those
with constant Jacobian), it is expected that future work which
applies this method of attack to other quadratic mappings will
generate similar results. As such, at this point we are confident
that this fingerprint is indicative of a bitstream which was
generated with the Henon scheme and we believe that a similar trace
will be generated by any other planar quadratic mapping (used to
generate a binary sequence as indicated in Heyman (1993)).
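
The generation scheme described at the start of this section can be
sketched in Python as below. This is an illustration under stated
assumptions, not the generator of Heyman (1993): it uses the
classic Henon parameters a = 1.4 and b = 0.3, discards a short
transient, and takes the median of the x-coordinates of the same
orbit as the dividing point. The function names, the burn-in
length, and the choice of threshold are all assumptions.

```python
import statistics


def henon_orbit(count, x0=0.0, y0=0.0, a=1.4, b=0.3, burn_in=100):
    """x-coordinates of a Henon orbit: x' = 1 - a*x^2 + y, y' = b*x."""
    x, y = x0, y0
    xs = []
    for i in range(burn_in + count):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= burn_in:  # skip the transient (length is an assumption)
            xs.append(x)
    return xs


def henon_bits(count, x0=0.0, y0=0.0):
    """Assign 1 to points right of the median x-value, 0 to the rest."""
    xs = henon_orbit(count, x0, y0)
    threshold = statistics.median(xs)
    return [1 if x > threshold else 0 for x in xs]
```

Calling `henon_bits(1000)` gives a 1000-bit stream that is balanced
by construction, since the threshold is the median of the orbit
itself; a fixed, previously computed median would be closer to the
description above.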



LINEAR CONGRUENTIAL 

The next generation method used was based on the MATLAB random
number generator, with the standard conversion from (0,1) to {0,1}.
What makes the results surprising is that MATLAB uses a basic
linear congruential generator, which generally is not considered
very sophisticated (although it is, of course, nonlinear; see
Gillespie (1992)). Be that as it may, and whatever its other
weaknesses are, as can be seen on the following graph, it does very
well on this test (in terms of unpredictability). Nonetheless, it
does have a distinctive fingerprint and thus the generation method
can still be identified.
The distinguishing features of this fingerprint are a testing
curve that hovers near 50% and a training curve that grows slowly
until the two curves intersect in the range of n equal to 20 to 25.
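
For readers who want to reproduce a bitstream of this kind, the
following Python sketch shows a linear congruential generator with
the conversion from (0,1) to {0,1}. The constants MATLAB used are
not stated here, so the sketch assumes, purely for illustration,
the Park-Miller "minimal standard" parameters (a = 16807, c = 0,
m = 2^31 - 1) and a threshold of 0.5; the function names are also
assumptions.

```python
def lcg_states(seed, count, a=16807, c=0, m=2**31 - 1):
    """Successive states of x <- (a*x + c) mod m (parameters assumed)."""
    x = seed
    states = []
    for _ in range(count):
        x = (a * x + c) % m
        states.append(x)
    return states


def lcg_bits(seed, count):
    """Scale each state into (0,1), then threshold at 0.5 to get a bit."""
    m = 2**31 - 1
    return [1 if x / m >= 0.5 else 0 for x in lcg_states(seed, count)]
```

A 1000-bit stream such as `lcg_bits(12345, 1000)` (the seed is
arbitrary) can then be fed to the same windowing and training
procedure used for the other generators.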

It is quite arguable, philosophically or physically, that 
there are no macroscopic random processes. However, for most 
purposes, it is acceptable to consider the flip of a coin as a 
reasonably random event. For completeness, the neural net attack 
was then attempted on a bitstream generated by 1000 coin flips and 
the results follow. 

[Figure: training and testing curves for 1000 coinflips]

As expected, the testing line hums right along around 50%.
This is in keeping with the fact that even a neural net can't make
progress on a truly random process. However, what is noteworthy is
that the training line does seem to show a continual improvement.
Since "progress" is purely a testing term this is not terribly
important except that it gives a secondary feature of the
fingerprint, which is similar to the linear congruential generator
discussed earlier.



CONCLUSION 

As bitstream generators become more sophisticated and schemes 
involving non-linear and chaotic processes become commonplace the 
traditional linear methods of attack will prove to be inadequate.
Our proposal of using a neural net addresses this by utilizing its
inherent nonlinear workings to attack this nonlinear problem: We
are fighting fire with fire. The graphs clearly indicate the value
of this method for finding the generator. We mention that many 
standard attacks are of the "known generator" variety, and in this 
sense the neural net fills in a crucial gap between theory and 
practice by determining that generator. We have intentionally used
the net in an unsophisticated manner, without varying the net
parameters, in order to demonstrate the viability of this approach
in general. The development of a library of fingerprints of known
generators and of adapted artificial neural networks to identify
them would certainly appear to be a worthwhile undertaking for the
cryptanalyst.

Beyond the idea of generator characterization considered in 
this paper, and as evidenced by the testing results from the Henon 
scheme, we further believe that this method will be effective in 
the actual prediction of bitstreams given some number of bits known 
to be correct. To achieve this, future work will have to include 
adaptive setting of the neural net software as well as the 
investigation of different types of neural nets. However, at the 
very least, we are absolutely convinced that neural nets can play 
a significant, and perhaps dominant, role in the process of 
attacking procedures that depend on pseudorandom bitstreams for 
their security. 